# Lightweight TTS

Argos 4b 0.2 Es

A text-to-speech model fine-tuned from Orpheus-3B that converts text into natural, fluent speech.

Speech Synthesis

Safetensors Spanish

Dia 1.6B is a text-to-speech model available in multiple quantized versions and compatible with the TTS.cpp framework.

Speech Synthesis

Spark TTS 0.5B 4 6bit

Spark-TTS-0.5B-4-6bit is a text-to-speech model in MLX format, supporting both English and Chinese.

Speech Synthesis Supports Multiple Languages

Spark TTS 0.5B Bf16

Spark-TTS-0.5B-fp16 is a text-to-speech model in MLX format, supporting both English and Chinese.

Speech Synthesis Supports Multiple Languages

OuteTTS is a 500M-parameter text-to-speech (TTS) model focused on Turkish, capable of converting Turkish text into natural speech.

Speech Synthesis Other

Canary Tts 0.5b

A Japanese TTS model trained on sarashina2.2-0.5b-instruct-v0.1, supporting quality control via prompts.

Speech Synthesis

PyTorch Supports Multiple Languages

A lightweight version of StyleTTS 2, focused on text-to-speech tasks, with multiple components removed to reduce complexity.

Speech Synthesis English

Orpheus 3b Kaya Q4 K M.gguf

A text-to-speech model fine-tuned from Canopy Labs' pre-trained model and quantized for efficient inference.

Speech Synthesis Supports Multiple Languages

Kokoro is a text-to-speech (TTS) model offering GGUF-encoded versions with dual phonemization support.

Speech Synthesis

3b De Ft Research Release 4bit

A German text-to-speech model converted to MLX format.

Speech Synthesis

Transformers German

Orpheus Bangla Tts Gguf 8bit

A proof-of-concept fine-tune of the Orpheus 3B text-to-speech (TTS) model that adds Bengali support.

Speech Synthesis Other

Orpheus Bangla Tts Gguf

A fine-tuned version of the Orpheus 3B TTS model for Bengali, trained on 955 audio samples and suitable for experimental Bengali speech synthesis.

Speech Synthesis Other

CiSiMi is an early prototype of a text-to-audio model designed for resource-constrained environments, intended to run efficiently on CPU while delivering advanced speech synthesis.

Speech Synthesis English

Kokoro is an open-source text-to-speech model with 82 million parameters, delivering sound quality comparable to large models through a lightweight architecture while significantly improving speed and cost efficiency.

Speech Synthesis English

Kokoro 82M V1.1 Zh

Kokoro is an open-weight series of small yet powerful text-to-speech (TTS) models, now featuring data from 100 Chinese speakers sourced from professional datasets.

Speech Synthesis

Kokoro is an open-source TTS model with 82 million parameters, delivering audio quality comparable to larger models while offering significant speed advantages and cost efficiency.

Speech Synthesis English

Kokoro is an open-source text-to-speech model with 82 million parameters, achieving sound quality comparable to large models with a lightweight architecture while improving generation speed and reducing computational costs.

Speech Synthesis English

Kokoro 82M Light

A clone of StyleTTS2-LJSpeech, optimized for English text-to-speech tasks, with reduced dependencies for simpler deployment.

Speech Synthesis English

ctranslate2-4you

Kokoro is a text-to-speech (TTS) model with 82 million parameters, released under the Apache 2.0 license. It ranked #1 in the TTS Spaces Arena, achieving higher Elo scores with fewer parameters and less data.

Speech Synthesis English

Kokoro is an open-source text-to-speech (TTS) model with 82 million parameters, renowned for its lightweight architecture and high audio quality, while also being fast and cost-effective.

Speech Synthesis English

Parler Tts Mini V1 GGUF

A GGUF-format model file of Parler TTS Mini v1 for English text-to-speech tasks.

Speech Synthesis English

Indri 0.1 350m Tts

Indri is an ultra-small, lightweight TTS model based on the Transformer architecture, supporting text-to-speech tasks in English and Hindi.

Speech Synthesis

Transformers Supports Multiple Languages

Japanese Parler Tts Large Bate

A Japanese text-to-speech model fine-tuned from parler-tts-large-v1, capable of generating high-quality Japanese speech.

Speech Synthesis

Transformers Japanese

Indri 0.1 124m Tts

Indri is an ultra-compact, lightweight TTS model based on the Transformer architecture, supporting English and Hindi text-to-speech tasks.

Speech Synthesis

Transformers Supports Multiple Languages

Parler Tts Mini V1.1

Parler-TTS Mini v1.1 is a lightweight text-to-speech model trained on 45,000 hours of audio data, capable of generating high-quality, natural-sounding speech with controllable features through simple text prompts.

Speech Synthesis

Transformers English

Parler Tts Tiny V1

A lightweight text-to-speech model trained on 45,000 hours of audio data, with voice attributes controllable through text prompts.

Speech Synthesis

Transformers English

Parler Tts Mini V0.1

Parler-TTS Mini is a lightweight text-to-speech model trained on 10.5K hours of audio data, supporting voice feature control through text prompts.

Speech Synthesis

Transformers English

An Urdu speech synthesis model based on microsoft/speecht5_tts, fine-tuned on the FLEURS dataset.

Speech Synthesis

Transformers Other

Pak-Speech-Processing

© 2025 AIbase